Add DataDriftTrigger: support one Evidently metric #409

Merged


jenny011
Collaborator

@jenny011 jenny011 commented Apr 30, 2024

This is a clean version of PR #367.

  1. Add a DataDriftTrigger class to the supervisor. It supports one configurable Evidently metric and runs drift detection every N new data points. Detection compares the data trained on at the previous trigger against all new data that has not yet triggered.
  2. Update the Trigger interface: Trigger.inform() now returns a Generator instead of a List.
  3. Add a generic ModelDownloader to the supervisor.
  4. Add example pipelines that use DataDriftTrigger.
  5. Add Evidently to the pylint known third-party list.
  6. Refactor ModelDownloader into the embedding encoder utils. The downloader sets up and returns the model; the DataDriftTrigger owns the model.
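
For readers unfamiliar with the trigger flow in items 1 and 2, here is a minimal sketch of the idea, not the code in this PR. It assumes samples arrive as `(key, timestamp, label)` tuples and that the first detection always triggers because no reference window exists yet; both are assumptions. `SketchDriftTrigger`, `mean_shift_detected`, and the `features` lookup are hypothetical names, and the mean-shift check is a simple stand-in for an Evidently metric so the example runs without extra dependencies.

```python
from collections.abc import Callable, Generator, Iterable, Sequence

# Assumed sample layout for illustration: (key, timestamp, label).
Sample = tuple[int, int, int]


def mean_shift_detected(reference: Sequence[float], current: Sequence[float],
                        threshold: float = 0.5) -> bool:
    """Stand-in drift test (NOT Evidently): compare the means of two windows."""
    ref_mean = sum(reference) / len(reference)
    cur_mean = sum(current) / len(current)
    return abs(ref_mean - cur_mean) > threshold


class SketchDriftTrigger:
    """Hypothetical trigger: runs a drift check every `detection_interval` samples."""

    def __init__(self, detection_interval: int, features: Callable[[int], float],
                 detect: Callable[[Sequence[float], Sequence[float]], bool] = mean_shift_detected):
        self.detection_interval = detection_interval
        self.features = features          # hypothetical lookup: key -> feature value
        self.detect = detect
        self.reference: list[int] = []    # keys trained on at the previous trigger
        self.untriggered: list[int] = []  # keys seen since the previous trigger
        self.since_detection = 0

    def inform(self, new_data: Iterable[Sample]) -> Generator[int, None, None]:
        """Lazily yield the index (within new_data) of each sample that fires a trigger."""
        for idx, (key, _timestamp, _label) in enumerate(new_data):
            self.untriggered.append(key)
            self.since_detection += 1
            if self.since_detection < self.detection_interval:
                continue
            self.since_detection = 0
            if not self.reference:
                # Assumption: with no previous trigger there is nothing to compare against,
                # so the first detection point always triggers.
                triggered = True
            else:
                triggered = self.detect(
                    [self.features(k) for k in self.reference],
                    [self.features(k) for k in self.untriggered],
                )
            if triggered:
                # The data that just triggered becomes the next reference window.
                self.reference = self.untriggered
                self.untriggered = []
                yield idx


# Usage: keys 0..5 map to feature 0.0, keys 6..9 to 5.0, detection every 2 samples.
data = [(key, 0, 0) for key in range(10)]
trigger = SketchDriftTrigger(2, lambda key: 0.0 if key < 6 else 5.0)
print(list(trigger.inform(data)))  # [1, 7, 9]
```

Returning a generator lets the caller act on each triggering index (for example, start training) before the rest of the batch is examined, which a fully materialized list would not allow.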

Future

  1. Support multiple configurable Evidently metrics (DataDriftTrigger: Support multiple metrics in trigger config and support combined metrics in drift detection, #416).
  2. Support Alibi-Detect (DataDriftTrigger: Integrate the drift detection library Alibi-Detect, #414).
  3. Support a custom embedding encoder (DataDriftTrigger: Support custom embedding encoder model, #417).
  4. Support different windowing for detection data, e.g. compare with all previously trained data (DataDriftTrigger: Reference and current data windows for drift detection, #418).
  5. Common DataLoaderInfo (#415).


Contributor

@MaxiBoether MaxiBoether left a comment


Thanks Jingyi! I did not have big comments, mostly smaller code-style changes and documentation issues. I think it might still be sensible to merge the downloader model thing first instead of extracting it later (probably overall less work) but we can do it however you prefer. As mentioned on Element, let's discuss how you proceed with the other TODOs. We definitely need the unit tests in this PR before merging :)


github-actions bot commented May 13, 2024

✅ Result of Pytest Coverage

---------- coverage: platform linux, python 3.12.3-final-0 -----------

Name Stmts Miss Cover
modyn/common/benchmark/stopwatch.py 26 0 100%
modyn/common/example_extension/example_extension.py 28 2 93%
modyn/common/ftp/ftp_server.py 31 18 42%
modyn/common/ftp/ftp_utils.py 83 69 17%
modyn/common/grpc/grpc_helpers.py 67 36 46%
modyn/common/trigger_sample/trigger_sample_storage.py 158 9 94%
modyn/config/schema/config.py 89 0 100%
modyn/config/schema/pipeline.py 145 4 97%
modyn/database/abstract_database_connection.py 35 0 100%
modyn/database/partition_by_meta.py 33 12 64%
modyn/evaluator/evaluator.py 15 0 100%
modyn/evaluator/evaluator_entrypoint.py 32 3 91%
modyn/evaluator/internal/dataset/evaluation_dataset.py 76 3 96%
modyn/evaluator/internal/grpc/evaluator_grpc_server.py 22 0 100%
modyn/evaluator/internal/grpc/evaluator_grpc_servicer.py 161 16 90%
modyn/evaluator/internal/metric_factory.py 18 1 94%
modyn/evaluator/internal/metrics/abstract_decomposable_metric.py 10 1 90%
modyn/evaluator/internal/metrics/abstract_evaluation_metric.py 29 2 93%
modyn/evaluator/internal/metrics/abstract_holistic_metric.py 10 1 90%
modyn/evaluator/internal/metrics/accuracy.py 20 2 90%
modyn/evaluator/internal/metrics/f1_score.py 63 0 100%
modyn/evaluator/internal/metrics/roc_auc.py 36 1 97%
modyn/evaluator/internal/pytorch_evaluator.py 113 28 75%
modyn/evaluator/internal/utils/evaluation_info.py 9 0 100%
modyn/evaluator/internal/utils/evaluation_process_info.py 8 0 100%
modyn/evaluator/internal/utils/evaluator_messages.py 3 0 100%
modyn/metadata_database/metadata_base.py 3 0 100%
modyn/metadata_database/metadata_database_connection.py 55 3 95%
modyn/metadata_database/models/pipelines.py 22 1 95%
modyn/metadata_database/models/sample_training_metadata.py 15 0 100%
modyn/metadata_database/models/selector_state_metadata.py 47 10 79%
modyn/metadata_database/models/trained_models.py 18 0 100%
modyn/metadata_database/models/trigger_partitions.py 10 0 100%
modyn/metadata_database/models/trigger_training_metadata.py 14 0 100%
modyn/metadata_database/models/triggers.py 10 0 100%
modyn/metadata_database/utils/model_storage_strategy_config.py 21 3 86%
modyn/metadata_processor/internal/grpc/metadata_processor_grpc_servicer.py 18 0 100%
modyn/metadata_processor/internal/grpc/metadata_processor_server.py 24 0 100%
modyn/metadata_processor/internal/metadata_processor_manager.py 23 4 83%
modyn/metadata_processor/metadata_processor.py 11 0 100%
modyn/metadata_processor/metadata_processor_entrypoint.py 24 1 96%
modyn/metadata_processor/processor_strategies/abstract_processor_strategy.py 30 0 100%
modyn/metadata_processor/processor_strategies/basic_processor_strategy.py 17 2 88%
modyn/metadata_processor/processor_strategies/processor_strategy_type.py 6 1 83%
modyn/model_storage/internal/grpc/grpc_server.py 23 0 100%
modyn/model_storage/internal/grpc/model_storage_grpc_servicer.py 54 0 100%
modyn/model_storage/internal/model_storage_manager.py 118 5 96%
modyn/model_storage/internal/storage_strategies/abstract_difference_operator.py 11 2 82%
modyn/model_storage/internal/storage_strategies/abstract_model_storage_strategy.py 16 1 94%
modyn/model_storage/internal/storage_strategies/difference_operators/sub_difference_operator.py 12 0 100%
modyn/model_storage/internal/storage_strategies/difference_operators/xor_difference_operator.py 14 0 100%
modyn/model_storage/internal/storage_strategies/full_model_strategies/abstract_full_model_strategy.py 26 2 92%
modyn/model_storage/internal/storage_strategies/full_model_strategies/binary_full_model.py 16 0 100%
modyn/model_storage/internal/storage_strategies/full_model_strategies/pytorch_full_model.py 15 0 100%
modyn/model_storage/internal/storage_strategies/incremental_model_strategies/abstract_incremental_model_strategy.py 26 10 62%
modyn/model_storage/internal/storage_strategies/incremental_model_strategies/weights_difference.py 99 1 99%
modyn/model_storage/internal/utils/model_storage_policy.py 35 0 100%
modyn/model_storage/model_storage.py 27 3 89%
modyn/model_storage/model_storage_entrypoint.py 32 3 91%
modyn/models/articlenet/articlenet.py 30 16 47%
modyn/models/coreset_methods_support.py 29 1 97%
modyn/models/dlrm/cuda_ext/dot_based_interact.py 24 13 46%
modyn/models/dlrm/cuda_ext/fused_gather_embedding.py 17 17 0%
modyn/models/dlrm/cuda_ext/sparse_embedding.py 32 32 0%
modyn/models/dlrm/dlrm.py 67 9 87%
modyn/models/dlrm/nn/embeddings.py 123 64 48%
modyn/models/dlrm/nn/factories.py 24 9 62%
modyn/models/dlrm/nn/interactions.py 50 11 78%
modyn/models/dlrm/nn/mlps.py 77 23 70%
modyn/models/dlrm/nn/parts.py 60 4 93%
modyn/models/dlrm/setup.py 5 5 0%
modyn/models/dlrm/utils/install_lib.py 11 7 36%
modyn/models/dlrm/utils/utils.py 28 0 100%
modyn/models/dummy/dummy.py 12 0 100%
modyn/models/fmownet/fmownet.py 25 0 100%
modyn/models/resnet18/resnet18.py 28 0 100%
modyn/models/resnet50/resnet50.py 28 0 100%
modyn/models/resnet152/resnet152.py 28 0 100%
modyn/models/tokenizers/distill_bert_tokenizer.py 11 0 100%
modyn/models/yearbooknet/yearbooknet.py 23 0 100%
modyn/selector/internal/grpc/selector_grpc_servicer.py 78 22 72%
modyn/selector/internal/grpc/selector_server.py 33 12 64%
modyn/selector/internal/selector_manager.py 130 43 67%
modyn/selector/internal/selector_strategies/abstract_selection_strategy.py 153 15 90%
modyn/selector/internal/selector_strategies/coreset_strategy.py 71 10 86%
modyn/selector/internal/selector_strategies/downsampling_strategies/abstract_downsampling_strategy.py 32 2 94%
modyn/selector/internal/selector_strategies/downsampling_strategies/craig_downsampling_strategy.py 14 9 36%
modyn/selector/internal/selector_strategies/downsampling_strategies/downsampling_scheduler.py 55 2 96%
modyn/selector/internal/selector_strategies/downsampling_strategies/gradmatch_downsampling_strategy.py 12 7 42%
modyn/selector/internal/selector_strategies/downsampling_strategies/gradnorm_downsampling_strategy.py 5 0 100%
modyn/selector/internal/selector_strategies/downsampling_strategies/kcentergreedy_downsampling_strategy.py 12 7 42%
modyn/selector/internal/selector_strategies/downsampling_strategies/loss_downsampling_strategy.py 5 0 100%
modyn/selector/internal/selector_strategies/downsampling_strategies/no_downsampling_strategy.py 14 2 86%
modyn/selector/internal/selector_strategies/downsampling_strategies/submodular_downsampling_strategy.py 19 13 32%
modyn/selector/internal/selector_strategies/downsampling_strategies/uncertainty_downsampling_strategy.py 15 10 33%
modyn/selector/internal/selector_strategies/downsampling_strategies/utils.py 10 0 100%
modyn/selector/internal/selector_strategies/freshness_sampling_strategy.py 133 14 89%
modyn/selector/internal/selector_strategies/new_data_strategy.py 102 11 89%
modyn/selector/internal/selector_strategies/presampling_strategies/abstract_balanced_strategy.py 56 0 100%
modyn/selector/internal/selector_strategies/presampling_strategies/abstract_presampling_strategy.py 26 2 92%
modyn/selector/internal/selector_strategies/presampling_strategies/label_balanced_presampling_strategy.py 6 0 100%
modyn/selector/internal/selector_strategies/presampling_strategies/no_presampling_strategy.py 16 1 94%
modyn/selector/internal/selector_strategies/presampling_strategies/random_no_replacement_presampling_strategy.py 41 0 100%
modyn/selector/internal/selector_strategies/presampling_strategies/random_presampling_strategy.py 16 0 100%
modyn/selector/internal/selector_strategies/presampling_strategies/trigger_balanced_presampling_strategy.py 12 1 92%
modyn/selector/internal/selector_strategies/presampling_strategies/utils.py 14 0 100%
modyn/selector/internal/storage_backend/abstract_storage_backend.py 32 6 81%
modyn/selector/internal/storage_backend/database/database_storage_backend.py 85 7 92%
modyn/selector/internal/storage_backend/local/local_storage_backend.py 136 5 96%
modyn/selector/selector.py 82 14 83%
modyn/selector/selector_entrypoint.py 31 3 90%
modyn/supervisor/entrypoint.py 31 3 90%
modyn/supervisor/internal/eval_strategies/abstract_eval_strategy.py 8 1 88%
modyn/supervisor/internal/eval_strategies/matrix_eval_strategy.py 21 0 100%
modyn/supervisor/internal/eval_strategies/offset_eval_strategy.py 19 0 100%
modyn/supervisor/internal/evaluation_result_writer/abstract_evaluation_result_writer.py 14 2 86%
modyn/supervisor/internal/evaluation_result_writer/json_result_writer.py 17 0 100%
modyn/supervisor/internal/evaluation_result_writer/log_result_writer.py 4 1 75%
modyn/supervisor/internal/evaluation_result_writer/tensorboard_result_writer.py 13 0 100%
modyn/supervisor/internal/grpc/enums.py 48 0 100%
modyn/supervisor/internal/grpc/supervisor_grpc_server.py 24 7 71%
modyn/supervisor/internal/grpc/supervisor_grpc_servicer.py 35 0 100%
modyn/supervisor/internal/grpc/template_msg.py 26 0 100%
modyn/supervisor/internal/grpc_handler.py 346 50 86%
modyn/supervisor/internal/pipeline_executor/pipeline_executor.py 269 25 91%
modyn/supervisor/internal/supervisor.py 191 26 86%
modyn/supervisor/internal/triggers/amounttrigger.py 16 0 100%
modyn/supervisor/internal/triggers/datadrifttrigger.py 134 33 75%
modyn/supervisor/internal/triggers/embedding_encoder_utils/embedding_encoder.py 30 19 37%
modyn/supervisor/internal/triggers/embedding_encoder_utils/embedding_encoder_downloader.py 50 31 38%
modyn/supervisor/internal/triggers/timetrigger.py 30 3 90%
modyn/supervisor/internal/triggers/trigger.py 14 0 100%
modyn/supervisor/internal/triggers/trigger_datasets/dataloader_info.py 15 12 20%
modyn/supervisor/internal/triggers/trigger_datasets/fixed_keys_dataset.py 73 3 96%
modyn/supervisor/internal/triggers/trigger_datasets/online_trigger_dataset.py 17 1 94%
modyn/supervisor/internal/triggers/utils.py 64 41 36%
modyn/supervisor/internal/utils/evaluation_status_reporter.py 31 0 100%
modyn/supervisor/internal/utils/pipeline_info.py 30 9 70%
modyn/supervisor/internal/utils/training_status_reporter.py 24 3 88%
modyn/tests/common/example_extension/test_example_extension.py 13 0 100%
modyn/tests/common/grpc/test_grpc_helpers.py 3 0 100%
modyn/tests/common/trigger_sample/test_trigger_sample_storage.py 128 0 100%
modyn/tests/database/test_abstract_database_connection.py 19 0 100%
modyn/tests/evaluator/internal/dataset/test_evaluation_dataset.py 131 2 98%
modyn/tests/evaluator/internal/grpc/test_evaluator_grpc_server.py 20 0 100%
modyn/tests/evaluator/internal/grpc/test_evaluator_grpc_servicer.py 317 16 95%
modyn/tests/evaluator/internal/metrics/test_accuracy.py 45 0 100%
modyn/tests/evaluator/internal/metrics/test_f1_score.py 53 0 100%
modyn/tests/evaluator/internal/metrics/test_roc_auc.py 31 0 100%
modyn/tests/evaluator/internal/test_metric_factory.py 13 0 100%
modyn/tests/evaluator/internal/test_pytorch_evaluator.py 163 19 88%
modyn/tests/evaluator/test_evaluator.py 30 0 100%
modyn/tests/evaluator/test_evaluator_entrypoint.py 22 0 100%
modyn/tests/metadata_database/models/test_pipelines.py 48 0 100%
modyn/tests/metadata_database/models/test_sample_training_metadata.py 40 0 100%
modyn/tests/metadata_database/models/test_selector_state_metadata.py 46 0 100%
modyn/tests/metadata_database/models/test_trained_models.py 48 0 100%
modyn/tests/metadata_database/models/test_trigger_training_metadata.py 38 0 100%
modyn/tests/metadata_database/models/test_triggers.py 33 0 100%
modyn/tests/metadata_database/test_metadata_database_connection.py 47 0 100%
modyn/tests/metadata_processor/internal/grpc/test_metadata_processor_grpc_servicer.py 26 0 100%
modyn/tests/metadata_processor/internal/grpc/test_metadata_processor_server.py 27 0 100%
modyn/tests/metadata_processor/internal/test_metadata_processor_manager.py 42 3 93%
modyn/tests/metadata_processor/processor_strategies/test_abstract_processor_strategy.py 60 0 100%
modyn/tests/metadata_processor/processor_strategies/test_basic_processor_strategy.py 43 0 100%
modyn/tests/metadata_processor/test_metadata_processor.py 22 3 86%
modyn/tests/metadata_processor/test_metadata_processor_entrypoint.py 22 0 100%
modyn/tests/model_storage/internal/grpc/test_model_storage_grpc_server.py 16 0 100%
modyn/tests/model_storage/internal/grpc/test_model_storage_grpc_servicer.py 100 0 100%
modyn/tests/model_storage/internal/storage_strategies/difference_operators/test_sub_difference_operator.py 16 0 100%
modyn/tests/model_storage/internal/storage_strategies/difference_operators/test_xor_difference_operator.py 16 0 100%
modyn/tests/model_storage/internal/storage_strategies/full_model_strategies/test_binary_full_model.py 27 1 96%
modyn/tests/model_storage/internal/storage_strategies/full_model_strategies/test_pytorch_full_model.py 36 1 97%
modyn/tests/model_storage/internal/storage_strategies/incremental_model_strategies/test_weights_difference.py 88 2 98%
modyn/tests/model_storage/internal/test_model_storage_manager.py 217 1 99%
modyn/tests/model_storage/internal/utils/test_model_storage_policy.py 28 0 100%
modyn/tests/model_storage/test_model_storage.py 37 0 100%
modyn/tests/model_storage/test_model_storage_entrypoint.py 22 0 100%
modyn/tests/models/test_bert_tokenizer.py 24 0 100%
modyn/tests/models/test_dlrm.py 46 0 100%
modyn/tests/models/test_dummy.py 8 0 100%
modyn/tests/models/test_embedding_recorder.py 27 0 100%
modyn/tests/models/test_fmownet.py 25 0 100%
modyn/tests/models/test_resnet18.py 22 0 100%
modyn/tests/models/test_resnet50.py 22 0 100%
modyn/tests/models/test_resnet152.py 22 0 100%
modyn/tests/models/test_yearbook_net.py 47 0 100%
modyn/tests/selector/internal/grpc/test_selector_grpc_servicer.py 132 0 100%
modyn/tests/selector/internal/grpc/test_selector_server.py 16 0 100%
modyn/tests/selector/internal/selector_strategies/downsampling_strategies/test_abstract_downsampling_strategy.py 15 0 100%
modyn/tests/selector/internal/selector_strategies/downsampling_strategies/test_gradnorm_downsampling_strategy.py 13 0 100%
modyn/tests/selector/internal/selector_strategies/downsampling_strategies/test_loss_downsampling_strategy.py 17 0 100%
modyn/tests/selector/internal/selector_strategies/downsampling_strategies/test_no_downsampling_strategy.py 5 0 100%
modyn/tests/selector/internal/selector_strategies/downsampling_strategies/test_scheduler.py 114 0 100%
modyn/tests/selector/internal/selector_strategies/presampling_strategies/test_abstract_balanced_strategy.py 14 0 100%
modyn/tests/selector/internal/selector_strategies/presampling_strategies/test_empty_presampling_strategy.py 0 0 100%
modyn/tests/selector/internal/selector_strategies/presampling_strategies/test_label_balanced_presampling_strategy.py 164 0 100%
modyn/tests/selector/internal/selector_strategies/presampling_strategies/test_random_no_replacement_presampling_strategy.py 51 0 100%
modyn/tests/selector/internal/selector_strategies/presampling_strategies/test_random_presampling_strategy.py 95 0 100%
modyn/tests/selector/internal/selector_strategies/presampling_strategies/test_trigger_balanced_presampling.py 139 0 100%
modyn/tests/selector/internal/selector_strategies/test_abstract_selection_strategy.py 171 0 100%
modyn/tests/selector/internal/selector_strategies/test_coreset_strategy.py 232 0 100%
modyn/tests/selector/internal/selector_strategies/test_freshness_sampling_strategy.py 308 0 100%
modyn/tests/selector/internal/selector_strategies/test_new_data_strategy.py 519 0 100%
modyn/tests/selector/internal/storage_backend/database/test_database_storage_backend.py 123 0 100%
modyn/tests/selector/internal/storage_backend/local/test_local_storage_backend.py 84 0 100%
modyn/tests/selector/internal/test_selector_manager.py 145 4 97%
modyn/tests/selector/test_selector.py 92 4 96%
modyn/tests/selector/test_selector_entrypoint.py 26 0 100%
modyn/tests/supervisor/internal/eval_strategies/test_matrix_eval_strategy.py 26 0 100%
modyn/tests/supervisor/internal/eval_strategies/test_offset_eval_strategy.py 8 0 100%
modyn/tests/supervisor/internal/evaluation_result_writer/test_abstract_evaluation_result_writer.py 7 0 100%
modyn/tests/supervisor/internal/evaluation_result_writer/test_json_result_writer.py 16 0 100%
modyn/tests/supervisor/internal/evaluation_result_writer/test_tensorboard_result_writer.py 21 0 100%
modyn/tests/supervisor/internal/grpc/test_supervisor_grpc_server.py 24 1 96%
modyn/tests/supervisor/internal/grpc/test_supervisor_grpc_servicer.py 51 0 100%
modyn/tests/supervisor/internal/pipeline_executor/test_pipeline_executor.py 397 6 98%
modyn/tests/supervisor/internal/test_grpc_handler.py 298 0 100%
modyn/tests/supervisor/internal/test_supervisor.py 230 5 98%
modyn/tests/supervisor/internal/triggers/test_amounttrigger.py 30 0 100%
modyn/tests/supervisor/internal/triggers/test_datadrifttrigger.py 109 2 98%
modyn/tests/supervisor/internal/triggers/test_timetrigger.py 26 0 100%
modyn/tests/supervisor/internal/triggers/test_trigger.py 5 0 100%
modyn/tests/supervisor/internal/triggers/trigger_datasets/test_fixed_keys_dataset.py 123 2 98%
modyn/tests/supervisor/internal/triggers/trigger_datasets/test_online_trigger_dataset.py 28 2 93%
modyn/tests/supervisor/test_entrypoint.py 26 0 100%
modyn/tests/trainer_server/internal/data/key_sources/test_local_key_source.py 89 0 100%
modyn/tests/trainer_server/internal/data/key_sources/test_selector_key_source.py 92 0 100%
modyn/tests/trainer_server/internal/data/test_data_utils.py 22 1 95%
modyn/tests/trainer_server/internal/data/test_local_dataset_writer.py 59 0 100%
modyn/tests/trainer_server/internal/data/test_online_dataset.py 347 3 99%
modyn/tests/trainer_server/internal/data/test_per_class_online_dataset.py 53 3 94%
modyn/tests/trainer_server/internal/grpc/test_trainer_server_grpc_server.py 17 0 100%
modyn/tests/trainer_server/internal/grpc/test_trainer_server_grpc_servicer.py 406 8 98%
modyn/tests/trainer_server/internal/metadata_collector/test_metadata_collector.py 41 0 100%
modyn/tests/trainer_server/internal/trainer/metadata_pytorch_callbacks/test_loss_callback.py 51 1 98%
modyn/tests/trainer_server/internal/trainer/remote_downsamplers/deepcore_comparison_tests_utils.py 21 1 95%
modyn/tests/trainer_server/internal/trainer/remote_downsamplers/test_abstract_matrix_downsampling_strategy.py 75 0 100%
modyn/tests/trainer_server/internal/trainer/remote_downsamplers/test_abstract_remote_downsampling_strategy.py 12 0 100%
modyn/tests/trainer_server/internal/trainer/remote_downsamplers/test_craig_remote_downsampling.py 249 0 100%
modyn/tests/trainer_server/internal/trainer/remote_downsamplers/test_get_tensor_subset.py 56 0 100%
modyn/tests/trainer_server/internal/trainer/remote_downsamplers/test_remote_gradmatch_downsampling_strategy.py 116 0 100%
modyn/tests/trainer_server/internal/trainer/remote_downsamplers/test_remote_gradnorm_downsample.py 92 0 100%
modyn/tests/trainer_server/internal/trainer/remote_downsamplers/test_remote_kcenter_downsampling_strategy.py 104 0 100%
modyn/tests/trainer_server/internal/trainer/remote_downsamplers/test_remote_loss_downsample.py 82 0 100%
modyn/tests/trainer_server/internal/trainer/remote_downsamplers/test_remote_submodular_downsampling_strategy.py 101 0 100%
modyn/tests/trainer_server/internal/trainer/remote_downsamplers/test_remote_uncertainty_downsampling_strategy.py 49 0 100%
modyn/tests/trainer_server/internal/trainer/test_pytorch_trainer.py 361 33 91%
modyn/tests/trainer_server/test_trainer_server.py 34 0 100%
modyn/tests/trainer_server/test_trainer_server_entrypoint.py 22 0 100%
modyn/tests/utils/test_utils.py 173 0 100%
modyn/trainer_server/custom_lr_schedulers/dlrm_lr_scheduler/dlrm_scheduler.py 33 33 0%
modyn/trainer_server/internal/dataset/data_utils.py 17 2 88%
modyn/trainer_server/internal/dataset/key_sources/abstract_key_source.py 21 5 76%
modyn/trainer_server/internal/dataset/key_sources/local_key_source.py 23 1 96%
modyn/trainer_server/internal/dataset/key_sources/selector_key_source.py 54 2 96%
modyn/trainer_server/internal/dataset/local_dataset_writer.py 55 3 95%
modyn/trainer_server/internal/dataset/online_dataset.py 276 28 90%
modyn/trainer_server/internal/dataset/per_class_online_dataset.py 14 0 100%
modyn/trainer_server/internal/grpc/trainer_server_grpc_server.py 22 0 100%
modyn/trainer_server/internal/grpc/trainer_server_grpc_servicer.py 244 38 84%
modyn/trainer_server/internal/metadata_collector/metadata_collector.py 33 0 100%
modyn/trainer_server/internal/mocks/mock_metadata_processor.py 22 2 91%
modyn/trainer_server/internal/trainer/metadata_pytorch_callbacks/base_callback.py 15 3 80%
modyn/trainer_server/internal/trainer/metadata_pytorch_callbacks/loss_callback.py 21 0 100%
modyn/trainer_server/internal/trainer/pytorch_trainer.py 492 159 68%
modyn/trainer_server/internal/trainer/remote_downsamplers/abstract_matrix_downsampling_strategy.py 66 4 94%
modyn/trainer_server/internal/trainer/remote_downsamplers/abstract_per_label_remote_downsample_strategy.py 9 1 89%
modyn/trainer_server/internal/trainer/remote_downsamplers/abstract_remote_downsampling_strategy.py 32 3 91%
modyn/trainer_server/internal/trainer/remote_downsamplers/deepcore_utils/cossim.py 28 17 39%
modyn/trainer_server/internal/trainer/remote_downsamplers/deepcore_utils/euclidean.py 29 12 59%
modyn/trainer_server/internal/trainer/remote_downsamplers/deepcore_utils/k_center_greedy.py 38 4 89%
modyn/trainer_server/internal/trainer/remote_downsamplers/deepcore_utils/orthogonal_matching_pursuit.py 66 34 48%
modyn/trainer_server/internal/trainer/remote_downsamplers/deepcore_utils/shuffling.py 9 0 100%
modyn/trainer_server/internal/trainer/remote_downsamplers/deepcore_utils/submodular_function.py 103 15 85%
modyn/trainer_server/internal/trainer/remote_downsamplers/deepcore_utils/submodular_optimizer.py 116 78 33%
modyn/trainer_server/internal/trainer/remote_downsamplers/remote_craig_downsampling.py 95 7 93%
modyn/trainer_server/internal/trainer/remote_downsamplers/remote_grad_match_downsampling_strategy.py 17 1 94%
modyn/trainer_server/internal/trainer/remote_downsamplers/remote_gradnorm_downsampling.py 42 5 88%
modyn/trainer_server/internal/trainer/remote_downsamplers/remote_kcenter_greedy_downsampling_strategy.py 15 0 100%
modyn/trainer_server/internal/trainer/remote_downsamplers/remote_loss_downsampling.py 34 5 85%
modyn/trainer_server/internal/trainer/remote_downsamplers/remote_submodular_downsampling_strategy.py 30 3 90%
modyn/trainer_server/internal/trainer/remote_downsamplers/remote_uncertainty_downsampling_strategy.py 61 18 70%
modyn/trainer_server/internal/utils/metric_type.py 3 0 100%
modyn/trainer_server/internal/utils/trainer_messages.py 4 0 100%
modyn/trainer_server/internal/utils/training_info.py 52 2 96%
modyn/trainer_server/internal/utils/training_process_info.py 10 0 100%
modyn/trainer_server/trainer_server.py 19 0 100%
modyn/trainer_server/trainer_server_entrypoint.py 32 3 91%
modyn/utils/utils.py 155 12 92%
TOTAL 17391 1583 91%
Coverage HTML written to
Required test coverage of
=============== 1860 passed, 5695

@MaxiBoether
Contributor

Thanks for the work. Let me know when I should review again.

Contributor

@MaxiBoether MaxiBoether left a comment


Thanks Jingyi, LGTM with the exception of one small abstractmethod thing. And I would like @XianzheMa to double check the changes on the triggering indices since we just changed some stuff there :) After he also approves and the abstract thing is fixed and CI runs through, we can merge!

Collaborator

@XianzheMa XianzheMa left a comment


LGTM. Can you address the small changes I requested before merging? Thanks!

…gers_within_batch. Add back persist_pipeline_log()
@jenny011 jenny011 merged commit 3d98d4d into main May 14, 2024
24 checks passed
robinholzi pushed a commit that referenced this pull request May 18, 2024
@XianzheMa XianzheMa deleted the JingyiZhu/feature/#366-data-drift-trigger-with-evidently branch June 12, 2024 14:44
@robinholzi robinholzi changed the title Add DataDriftTrigger: supports one Evidently metric Add DataDriftTrigger: support one Evidently metric Jul 29, 2024
Successfully merging this pull request may close these issues.

Implement DataDriftTrigger using Evidently
3 participants